Predictability of intelligence and age from structural connectomes

In this study, structural images of 1048 healthy subjects from the Human Connectome Project Young Adult study and 94 from ADNI-3 study were processed by an in-house tractography pipeline and analyzed together with pre-processed data of the same subjects from braingraph.org. Whole brain structural connectome features were used to build a simple correlation-based regression machine learning model to predict intelligence and age of healthy subjects. Our results showed that different forms of intelligence as well as age are predictable to a certain degree from diffusion tensor imaging detecting anatomical fiber tracts in the living human brain. Though we did not identify significant differences in the prediction capability for the investigated features depending on the imaging feature extraction method, we did find that crystallized intelligence was consistently better predictable than fluid intelligence from structural connectivity data through all datasets. Our findings suggest a practical and scalable processing and analysis framework to explore broader research topics employing brain MR imaging.


Introduction
The connectome, the entire map of neural connections, uniquely represents every subject's gender, age and intelligence like a fingerprint [1]. Intelligence is known to be affected by, e.g., topological properties of brain networks such as characteristic path length and global network efficiency [2,3]. The association between a lower characteristic path length and IQ has also been described for resting state functional MR imaging (rs-fMRI) networks [4]. The prediction of not only gender [5,6] and age [7,8] but also different forms of intelligence [5,6,9-11] in individual subjects has made significant progress through the use of rs-fMRI. However, far less is known about the degree to which the underlying structural connectome, the backbone of the functional interactions, is also predictive of age and intelligence in cognitively normal adults. Several studies have tested the predictability of brain age using more advanced machine learning models. For example, Lin et al. predicted older individuals' age using artificial neural networks [12] and Taoudi-Benchekroun et al. used deep neural networks and random forests to predict the age of infants [13].
Previous studies suggest that crystallized and fluid intelligence behave differently during normal aging [14]. Fluid intelligence reflects one's ability to acquire new knowledge and is expressed in problem-solving and adaptation to unknown environments; it is therefore assessed with cognitive tasks covering cognitive flexibility, working memory, and information processing speed. Crystallized intelligence, in contrast, reflects experience-based knowledge and the ability to access it and is measured, e.g., by vocabulary and decoding tasks [15-18]. Shokri-Kojori et al. [19] compared age-related variance between younger and older adults (100 subjects) for gray matter (GM) and white matter (WM) tissue-specific age scores. They found that the WM age score accounted for significantly more variance in chronological age and was negatively associated with crystallized intelligence in older adults. Góngora et al. [20] reconstructed 10 tracts by deterministic tractography in 83 healthy individuals from the Cuban Human Brain Mapping Project. Their results showed predictive effects of the forceps minor tract on crystallized intelligence and of the superior longitudinal fasciculus on fluid intelligence.
In this study, we explored how well different intelligence measures and age of cognitively normal adult subjects can be predicted from the structural connectome as quantified by diffusion-weighted imaging (DWI). Specifically, we processed DWI data of the HCP Young Adult (1048 subjects) and ADNI-3 (94 cognitively normal elderly subjects) datasets to reconstruct whole brain structural connectome features with our in-house tool, NICARA. We then applied the correlation-based regression (CBR) machine learning method [21] to the features extracted by NICARA to predict age as well as total, fluid and crystallized intelligence. A similar approach was also suggested by Shen et al. [22]. To further explore the predictive capability of the CBR ML model and allow for additional statistical comparisons, we also included the structural connectome features of the HCP dataset preprocessed by braingraph.org, which are available with different parcellations.

Ethics statement
According to national law and institutional rules, research involving the analysis of existing data, where the data is either already publicly available or will be analyzed such that individual subjects cannot be identified, is exempt from IRB oversight.

Datasets
We investigated the prediction capability for different intelligence measures and age of two different DTI pipelines (NICARA [23] and braingraph.org [24]) based on structural connectivity data of 1048 subjects from the Human Connectome Project (HCP) Young Adult study [25]. Investigated features were age, total intelligence, fluid intelligence, and crystallized intelligence (Table 1). The intelligence measures were the unadjusted cognitive function composite score, fluid cognition composite score, and crystallized cognition composite score, respectively, based on the NIH toolbox [15].
The braingraph.org database was constructed by deterministic ROI-based fiber tracking (10 × averaged) of 1064 healthy subjects from the HCP Young Adult dataset. It was available with five different sets of ROIs (86, 129, 234, 463 and 1015 ROIs) and therefore also provides a good opportunity to investigate the influence of the parcellation method on the prediction outcome.
The full HCP dataset consisted of 1065 subjects, and we excluded subjects for whom no data was available from braingraph.org (2 subjects) and for whom not all investigated features were available (15 subjects). After this, 1048 subjects (S1 Table) remained for the analysis (484 male, 564 female).
Since the age range of subjects from the HCP Young Adult cohort is relatively small (22-37 years with a mean of 28.75 years and a standard deviation of 3.68 years; Table 1), we decided to additionally include a second cohort with adult and aged healthy subjects from the ADNI-3 study of the Alzheimer's Disease Neuroimaging Initiative [26].We used this second cohort (S2

Data processing
We processed structural connectomes of 1048 subjects from the HCP Young Adult study as well as the 94 ADNI subjects. The in-house processing pipeline used 379 ROIs, comprising 360 cortical ROIs (HCP-MMP1.0, Human Connectome Project Multi-Modal Parcellation version 1.0 atlas [27]) and 19 subcortical ROIs (Harvard-Oxford subcortical atlas [28-30]), and 50 million streamlines for the probabilistic whole brain tractography. As comparison datasets to better assess the influence of different parcellations, we downloaded preprocessed connectivity data derived from the same HCP subjects from the braingraph.org database [24], which had been obtained with different processing routines. The braingraph.org datasets were available with five different numbers of ROIs (86, 129, 234, 463 and 1015 ROIs).
In our in-house processing pipeline, the T1w images were first defaced using SPM12 (https://www.fil.ion.ucl.ac.uk/spm/). The T1w and DTI images were co-registered, and the images in native space were transformed into the MNI152 space by normalization. Tissue-based segmentation was performed using the CAT12 toolbox [31]. Defaced T1w images were skull-stripped using the adaptive probability region-growing (APRG) approach and used as input for FreeSurfer [32], which was used to apply cortical and subcortical parcellation in order to project the HCP MMP 1.0 and Harvard-Oxford subcortical atlas to native space via "fsaverage". Noise and distortion correction methods were applied to the DWI images using MRtrix3 [33], FSL [34], and ANTs (Advanced Normalization Tools) [35]. Anatomically Constrained Tractography (ACT) was applied using MRtrix3 with 50 million streamlines, yielding the 379 × 379 connectivity matrices used for the analysis. All processing steps were assembled into a standalone pipeline optimized for automated execution (referred to as NICARA or in-house pipeline) and run by our proprietary neuroimaging solution NICARA Version 2.0, Labvantage-Biomax GmbH, Planegg, Germany (https://nicara.eu).
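To give a sense of how the number of ROIs translates into the number of edges entering the prediction, the following minimal sketch counts the unique connections per parcellation and flattens a toy matrix. It assumes undirected connectivity without self-connections, so that only the strict upper triangle is vectorized; the text does not state this explicitly, and the helper names `edge_count` and `vectorize_upper` are hypothetical.

```python
from typing import List, Sequence


def edge_count(n_rois: int) -> int:
    """Unique undirected edges between n_rois regions (assumption: no self-connections)."""
    return n_rois * (n_rois - 1) // 2


def vectorize_upper(matrix: Sequence[Sequence[float]]) -> List[float]:
    """Flatten the strict upper triangle of a square, symmetric matrix row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


# Parcellation sizes named in the text: braingraph.org sets and the in-house 379 ROIs.
for n_rois in (86, 129, 234, 379, 463, 1015):
    print(n_rois, "ROIs ->", edge_count(n_rois), "edges")

# Toy 3 x 3 connectivity matrix (made-up streamline counts, not study data).
toy = [
    [0, 5, 2],
    [5, 0, 7],
    [2, 7, 0],
]
toy_edges = vectorize_upper(toy)
print(toy_edges)
```

Under this assumption, the feature vector per subject grows quadratically with the number of ROIs, which is worth keeping in mind when comparing coarse and fine parcellations.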

Feature prediction
In order to predict features based on the structural connectivity matrices, we applied the correlation-based regression algorithm proposed by Han et al. [21].
Given n subjects and m edges (depending on the brain parcellation), we first obtained a correlation vector R of size m from the correlation between the matrix A of unnormalized connection strengths of size n × m (containing the vectorized connectivity matrices per subject as rows) and the attribute of interest (Eq 1). The correlation vector R contains the Pearson's correlation coefficients r (Eq 2) between $A_i$ (the ith column of matrix A) and b, a vector of length n containing the values of the attribute of interest (e.g. age) for each subject.

$$R_i = r(A_i, b) \tag{1}$$

$$r(x, y) = \frac{\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})}{\sqrt{\sum_{k=1}^{n} (x_k - \bar{x})^2 \, \sum_{k=1}^{n} (y_k - \bar{y})^2}} \tag{2}$$

Here $\bar{x}$ and $\bar{y}$ denote the means of x and y over the n subjects.
Subsequently, we calculated the vector of predictor values S of size n for the regression analysis by summing up each subject's connection strengths weighted by the respective correlation coefficient of the edge (Eq 3).

$$S_j = \sum_{i=1}^{m} A_{ji} R_i \tag{3}$$

The predictor value $S_j$ for subject j is the scalar product of the jth row of A and the correlation vector R.
Based on the predictor values, we fitted a simple linear least-squares regression model that was then used for predicting the feature value of interest. To assess and compare the prediction quality, a 10-fold cross-validation was performed for each dataset. R was calculated from the training data only, and the Pearson's r between predicted and actual values as well as the mean absolute error (Eq 4) and range-normalized mean absolute error were calculated from the ten iterations of the cross-validation.
The mean absolute error (MAE) is the mean of the absolute differences between the predicted values g and the real values h.

$$\mathrm{MAE} = \frac{1}{n} \sum_{k=1}^{n} |g_k - h_k| \tag{4}$$

The range-normalized mean absolute error (NMAE) is obtained by dividing the MAE by the difference between the maximal and minimal value of the data, and therefore allows for better comparability between different datasets (unpaired data).
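The procedure described above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the code from S1 File: the function names are hypothetical, the data are synthetic, folds are formed by a seeded shuffle, and the metrics are pooled over all held-out predictions (the text does not say whether they were pooled or averaged per fold).

```python
import math
import random
from typing import List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0


def fit_cbr(A: List[List[float]], b: List[float]) -> Tuple[List[float], float, float]:
    """Fit on training data: edge-wise correlations R, then least squares b ~ S."""
    m = len(A[0])
    R = [pearson_r([row[i] for row in A], b) for i in range(m)]
    S = [sum(a * r for a, r in zip(row, R)) for row in A]  # predictor values
    n = len(S)
    ms, mb = sum(S) / n, sum(b) / n
    sss = sum((s - ms) ** 2 for s in S)
    slope = sum((s - ms) * (v - mb) for s, v in zip(S, b)) / sss if sss > 0 else 0.0
    return R, slope, mb - slope * ms


def predict_cbr(A: List[List[float]], model: Tuple[List[float], float, float]) -> List[float]:
    """Apply the training-set R and regression line to unseen subjects."""
    R, slope, intercept = model
    return [slope * sum(a * r for a, r in zip(row, R)) + intercept for row in A]


def cross_validate(A: List[List[float]], b: List[float], k: int = 10, seed: int = 0):
    """k-fold cross-validation; returns (r, MAE, NMAE, held-out predictions)."""
    idx = list(range(len(A)))
    random.Random(seed).shuffle(idx)
    pred = [0.0] * len(A)
    for f in range(k):
        fold = idx[f::k]
        held = set(fold)
        train = [i for i in idx if i not in held]
        model = fit_cbr([A[i] for i in train], [b[i] for i in train])
        for i, p in zip(fold, predict_cbr([A[i] for i in fold], model)):
            pred[i] = p
    mae = sum(abs(p - v) for p, v in zip(pred, b)) / len(b)
    nmae = mae / (max(b) - min(b))
    return pearson_r(b, pred), mae, nmae, pred


# Synthetic demonstration data (not study data): 60 subjects, 20 edges,
# with the target depending on the first three edges plus noise.
rng = random.Random(1)
A_demo = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(60)]
b_demo = [30 + 2 * row[0] + 2 * row[1] - 2 * row[2] + rng.gauss(0, 0.5) for row in A_demo]
r_cv, mae_cv, nmae_cv, pred_cv = cross_validate(A_demo, b_demo, k=10)
print(f"r = {r_cv:.2f}, MAE = {mae_cv:.2f}, NMAE = {nmae_cv:.2f}")
```

Computing R inside each training fold, as done here, mirrors the statement that R was calculated from the training data only and avoids leaking held-out subjects into the edge weights.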

Statistical analysis
Statistical analysis was performed on the paired absolute errors of subjects obtained from the cross-validation using the Wilcoxon signed-rank test. Comparisons were made between the different datasets for the same feature, as well as within each dataset between crystallized and fluid intelligence. We applied the Benjamini-Hochberg multiple testing correction with a false discovery rate (FDR) of 0.05 to the p-values of each analysis.
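The Benjamini-Hochberg step-up procedure mentioned above can be illustrated as follows. This is a minimal sketch of the standard procedure, not the authors' implementation; the function name is hypothetical and the p-values are made up for demonstration. The Wilcoxon signed-rank test that would produce such p-values is not shown.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], fdr: float = 0.05) -> List[bool]:
    """Return, for each p-value, whether it is rejected at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k / m * fdr.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            k_max = rank
    # Step-up rule: reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected


# Hypothetical uncorrected p-values (not taken from the study).
p_demo = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
decisions = benjamini_hochberg(p_demo, fdr=0.05)
print(decisions)
```

In this made-up example, five p-values lie below 0.05 before correction but only the two smallest survive it, which is the same pattern as uncorrected p-values below 0.05 losing significance once the whole family of comparisons is taken into account.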

Results
We applied the correlation-based regression algorithm (S1 File) to predict age for different HCP datasets comprising 1048 subjects (NICARA, braingraph.org datasets) and a subset of the ADNI study comprising 94 subjects, as well as different intelligence measures in case of the HCP datasets. Group comparisons were performed based on the mean absolute error (Eq 4) as well as the Pearson correlation coefficient between the actual and predicted values based on a regression analysis of the total data as well as gender-specific subgroups. Then we applied the Wilcoxon signed-rank test for age, total, crystallized and fluid intelligence to identify differences in the paired absolute errors over the different pipeline conditions. Finally, Wilcoxon signed-rank tests of the paired absolute errors revealed significant differences between crystallized and fluid intelligence in all comparisons.

Age prediction
For the HCP datasets, the maximal observed Pearson correlation coefficients between the features and age per dataset ranged from 0.105 for the 129 node braingraph dataset to 0.227 for the in-house pipeline (S3 Table), while the minimal Pearson coefficients ranged from -0.104 for the in-house pipeline to -0.168 for the 86 and 129 node braingraph datasets (S4 Table). The maximum for the ADNI dataset was r = 0.619 and the minimum r = -0.550, and thus stronger than the values observed in the HCP datasets.
Our in-house pipeline achieved the highest Pearson correlation between actual and predicted age values (r = 0.21) and the lowest mean absolute error (MAE = 3.02) out of the datasets (Table 3; Fig 2). Within the braingraph datasets, the MAE slightly increased with the number of ROIs (3.02 to 3.13). Based on the Wilcoxon signed-rank test for the paired absolute errors of the full datasets, the difference between the NICARA dataset and the following braingraph datasets was significant before a multiple testing correction: braingraph 129 ROIs (p = 0.0251), braingraph 234 ROIs (p = 0.0245), braingraph 463 ROIs (p = 0.0127) and braingraph 1015 ROIs (p = 0.0085). Within the braingraph datasets, the following were significant: 86 vs 1015 ROIs (p = 0.0350), 234 vs 463 ROIs (p = 0.0478) and 234 vs 1015 ROIs (p = 0.0251). However, after applying the Benjamini-Hochberg multiple testing correction (FDR = 0.05), none of the p-values remained significant. In the gender-specific subgroups, the following two comparisons showed a significant p-value for males before but not after multiple testing correction: 86 vs 129 ROIs (p = 0.035) and 86 vs 234 ROIs (p = 0.006).
The prediction result for the ADNI data (Fig 3) achieved a higher correlation value between predicted and real values (r = 0.57) compared to the HCP datasets ($r_{max}$ = 0.22) and a lower NMAE (0.14) compared to the best value out of the HCP datasets (0.20). Dividing the data into gender-specific groups slightly improved the prediction quality for the ADNI data in case of females ($r_F$ = 0.68, $NMAE_F$ = 0.14), but slightly decreased it for the male-only subjects ($r_M$ = 0.52, $NMAE_M$ = 0.17; Table 4). For the HCP data, gender-specific analysis led to inconsistent results and a mostly worse prediction outcome when compared to the total dataset, but the female subgroup tended to perform better than the male subgroup.
The prediction for the different intelligence measures performed similarly between all available datasets. Data from the in-house pipeline showed the lowest MAE for total (11.47) and fluid (9.36) intelligence, while the 129 ROI braingraph dataset had the lowest MAE (7.67) in case of crystallized intelligence (Tables 5-7).
The braingraph dataset with 1015 ROIs showed the highest correlation values for all intelligence measures for all subjects with r = 0.24 (Tables 5-7; Figs 4-6). Within the braingraph datasets, an increasing ROI number tended to increase the MAE but also, slightly, the correlation between actual and predicted values. For total intelligence, only the absolute

For the comparisons of absolute errors in the gender-specific subgroups for total intelligence, 129 vs 234 ROI braingraph (p = 0.045) was significant in the male subgroup and 86 vs 463 ROI (p = 0.0362), 86 vs 1015 ROI (p = 0.0128), 129 vs 1015 ROI (p = 0.0223), and 234 vs 1015 ROI (p = 0.0231) in the female subgroup before but not after multiple testing correction. For crystallized intelligence, 129 vs 234 ROI (p = 0.0417) showed significance in the male subgroup and the following in the female subgroup: NICARA vs 234 ROI (p = 0.0253), NICARA vs 463 ROI (p = 0.0252), and NICARA vs 1015 ROI (p = 0.0262). None of the p-values remained significant after multiple testing correction. For fluid intelligence in the male subgroup, the following comparisons were significant before multiple testing correction: NICARA vs 234 ROI (p = 0.0285), 86 vs 129 ROI (p = 0.0357), 86 vs 234 ROI (p = 0.0178), 129 vs 1015

Since the fluid and crystallized cognition composite scores have the same range, we also applied the Wilcoxon signed-rank test to the absolute error differences between them. The differences were significant for all datasets (Table 8) and all of them remained significant after a Benjamini-Hochberg multiple testing correction (FDR = 0.05).
Gender-specific analysis led to a worse prediction outcome for males in case of total intelligence for all datasets compared to the results for all subjects. For females, the outcome improved for all datasets except for the 1015 ROI dataset based on the NMAE, but the correlation value only improved for the NICARA dataset compared to the total data. For crystallized and fluid intelligence, gender-specific analysis did not improve the outcome. In case of fluid intelligence, the correlation values between actual and predicted values for the male subgroup in the braingraph datasets were no longer significant (Table 7).

Discussion
In this study, we investigated the predictability of age and intelligence measures based on whole brain structural connectome measurements under various conditions of image processing and gender. First, we demonstrated that NICARA processing combined with our machine learning approach predicted the age of ADNI subjects reasonably well but performed worse for HCP subjects, consistent with the braingraph.org results. Second, crystallized intelligence was generally better predicted than fluid intelligence. Finally, we found two interesting trends of gender effects for fluid and crystallized intelligence predictability that did not remain significant after multiple testing correction.
Using a simple machine learning approach based on whole brain structural connectivity only, we were able to predict age reasonably well for cognitively normal ADNI-3 control subjects (N = 94) with a distinct age range (r = 0.57, NMAE = 0.14). Our finding is similar to that of recent studies in both the MAE and the standard deviation of age prediction using structural connectivity data of ADNI [36,37], although both studies employed different workflows to extract structural connectivity features and different prediction methods.
However, the prediction quality for age was worse in case of the HCP Young Adult dataset (r = 0.21, NMAE = 0.20), which only covered a narrow range of young subjects, for both investigated pipelines (in-house and braingraph.org) under different parcellation methods despite its large size (N = 1048). Multi-factor characteristics of the structural connectivity in the HCP Young Adult dataset could account for the limited predictability of age. This effect of the HCP dataset is evident from the comparable age prediction quality obtained with the distinct processing approaches of NICARA and braingraph.org.
For subjects of the HCP Young Adult study, total, fluid and crystallized intelligence values were measured as cognition composite scores. The overall low predictability of intelligence from whole brain structural connectome features is in line with the statement by Wu and colleagues [38] that their prediction, which combined cortical and subcortical surfaces, yielded the highest accuracy for fluid intelligence in both the ABCD (N = 8070, r = 0.314) and HCP datasets (N = 1097, r = 0.454), outperforming the state-of-the-art prediction of fluid intelligence from any other brain measure in the literature. Wu and colleagues developed novel graph convolutional neural networks (gCNNs) for the analysis of localized anatomic shape and prediction of fluid intelligence. Krämer et al. [39] reported an outcome similar to that of this study, with only a small trend towards a multimodal benefit, and therefore concluded that developing a biomarker for cognitive aging remains challenging. Their study employed multimodal information, i.e., region-wise grey matter volume (GMV), resting-state functional connectivity (RSFC), and structural connectivity (SC), and generalized results across different ML approaches in 594 healthy older adults (age range: 55-85 years) from the 1000BRAINS dataset.
Predictability of crystallized intelligence was better than that of fluid intelligence in all investigated datasets based on the Wilcoxon signed-rank test applied to the paired absolute errors (Table 8), which may imply that connectivity derived from whole brain white matter probabilistic tractography is more strongly related to crystallized than to fluid intelligence. Similar findings have been published from an investigation of distinct multi-region neuroanatomical patterns extracted from grey matter surface as well as volumetric assessments in 1089 HCP subjects using an elastic net regression model [40]. Our finding is also consistent with another study that investigated 415 HCP subjects with a similar tractography method but using only 86 ROIs from FreeSurfer [41]. A much finer atlas parcellation with 439 ROIs created in that study did not show a significant difference in predictability between crystallized and fluid intelligence [41]. Other papers considered highly correlated neuroanatomical morphometry profiles, i.e. cortical surface area, and the environmental impact on the relevant neuroanatomical morphometry as explanations for the better predictability of crystallized intelligence compared to fluid intelligence [40,42].
Interestingly, we found tendencies towards slightly better prediction of intelligence in NICARA in the gender-specific subgroup analysis based on the range-normalized mean absolute errors and the Pearson correlation between actual and predicted values. This supports the finding that gender-specific factors [43-48] may affect connectivity as well as the relationship between connectomics and cognition [11].
With the publicly available braingraph.org dataset, connectome data from a second structural connectome extraction method for the same HCP subjects was available with five different parcellations (86, 129, 234, 463 and 1015 ROIs). Within the braingraph datasets, the MAE tended to increase with a higher ROI number. This may be due to the parcellation method itself, or to the fact that the resulting connectivity matrix might be too sparse when only 1 million streamlines are used for a finer parcellation [41,49-51]. However, the differences between the braingraph datasets were not significant after correcting the p-values obtained by Wilcoxon signed-rank tests between the paired absolute errors for multiple testing using the Benjamini-Hochberg method (FDR = 0.05).

Limitations
An obvious shortcoming of the HCP Young Adult subjects for the prediction of age and intelligence is the bias in the data distribution. The data only contained young subjects (aged 22-37 years), and the cognition composite scores are clearly biased towards higher values. The United States average is 100, but the mean values for the three scores of the 1048 subjects were all higher than that (Table 1). We could show the influence of the age range by also applying the approach to control subjects from the ADNI-3 study with an age range about twice as large (Δ = 35.0 years vs. Δ = 15 years), resulting in a better prediction performance (reflected by a lower NMAE and a higher correlation between actual and predicted values), even though these subjects were also biased as they contained elderly individuals only (56.5-91.5 years).
Apart from the bias, a possible explanation for the low predictability of intelligence in this study can be seen in the structural correlates of intelligent behavior. Human intelligence is thought to be associated with physiological and morphological properties of cortical pyramidal neurons [52], which, of course, can only be captured indirectly by whole brain DTI-based fiber tracking.
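The role of the age range in the range-normalized error can be checked with a short calculation. The sketch below uses only numbers quoted in the text (HCP: MAE = 3.02 with ages 22-37; ADNI: NMAE = 0.14 with ages 56.5-91.5); the ADNI MAE is not quoted in the text, so the value derived here is an approximation implied by the rounded NMAE, and the function name is hypothetical.

```python
def range_normalized_mae(mae: float, value_range: float) -> float:
    """NMAE: mean absolute error divided by the range of the target values."""
    return mae / value_range


hcp_range = 37 - 22        # years, HCP Young Adult age range
adni_range = 91.5 - 56.5   # years, ADNI-3 control subjects

# HCP: reported MAE of the in-house pipeline divided by the age range.
hcp_nmae = range_normalized_mae(3.02, hcp_range)

# ADNI: MAE implied by the reported (rounded) NMAE; approximate only.
adni_mae_implied = 0.14 * adni_range

print(f"HCP NMAE   = {hcp_nmae:.2f}")
print(f"ADNI MAE  ~= {adni_mae_implied:.1f} years")
```

The HCP value reproduces the reported NMAE of 0.20. The ADNI prediction thus has a larger absolute error in years but a smaller error relative to the age range covered, which is why the NMAE rather than the MAE is used to compare the two cohorts.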

Future approaches
To verify the age prediction, it would be favorable to apply the proposed approach to a dataset with a wider age range such as the Lifespan Human Connectome Project in Aging [53] once it is complete and available. Moreover, it has been previously discussed whether sets of connections would uniquely map onto cognitive function [54]. Further work could elaborate on this idea to test whether specific functional brain networks [55] and their connectedness are more predictive than individual connections or sets of connections alone. From studies with multiple sclerosis patients, it is known that graph theoretical measures are better descriptors of cognitive decline [56,57] than the strengths of individual connections. For future studies on the prediction quality of the connectome for cognitive measures, it would therefore be favorable to include graph theoretical measures [58] as well. Combining structural and functional brain networks for intelligence prediction has so far yielded ambivalent results [40]. More work is certainly needed here, for example, by using brain multiplex networks [59] to increase the predictive power of multimodal connectomes. Being able to predict brain age and intelligence from brain connectivity could have a large impact on monitoring disease progression in dementia or other brain diseases associated with cognitive decline, for example, multiple sclerosis.
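As a pointer to what such graph theoretical measures look like in practice, the following minimal sketch computes the characteristic path length and global efficiency, the two measures named in the Introduction, for a toy binary graph. It is an illustration only: it assumes an unweighted, undirected graph given as an adjacency list, whereas structural connectomes are weighted, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations
from typing import Dict, List, Tuple


def shortest_path_lengths(adj: Dict[int, List[int]], source: int) -> Dict[int, int]:
    """Breadth-first search distances (in edges) from source to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_length_and_efficiency(adj: Dict[int, List[int]]) -> Tuple[float, float]:
    """Characteristic path length (mean over connected pairs) and global efficiency."""
    nodes = sorted(adj)
    all_dist = {u: shortest_path_lengths(adj, u) for u in nodes}
    lengths: List[int] = []
    inverse_sum = 0.0
    n_pairs = 0
    for u, v in combinations(nodes, 2):
        n_pairs += 1
        d = all_dist[u].get(v)
        if d is not None:  # disconnected pairs contribute 0 to efficiency
            lengths.append(d)
            inverse_sum += 1.0 / d
    cpl = sum(lengths) / len(lengths) if lengths else float("inf")
    efficiency = inverse_sum / n_pairs if n_pairs else 0.0
    return cpl, efficiency


# Toy chain of four regions: 0 - 1 - 2 - 3 (not study data).
toy_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cpl_demo, eff_demo = path_length_and_efficiency(toy_graph)
print(f"characteristic path length = {cpl_demo:.3f}, global efficiency = {eff_demo:.3f}")
```

Measures of this kind summarize a whole connectivity matrix in a single number per subject, which is the property that makes them candidates for the prediction features suggested above.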

Conclusion
In conclusion, this study explored the predictability of age and intelligence in cognitively normal subjects from the HCP and ADNI datasets by means of the whole brain tractography-based structural connectome, applying a simple and easy-to-use established machine learning method. To our knowledge, this is the first study focusing on a single neuroimaging modality feature to predict age and intelligence from whole-brain tractography. The good predictability of age in ADNI and the finding that crystallized intelligence was more predictable than fluid intelligence in the HCP datasets were made possible by the combination of the NICARA structural connectome pipeline and a simple machine learning model. Therefore, we believe that such a combination could provide a reliable framework option to further narrow the gap in prediction between neuroimaging features and subjects' cognition as well as other biological features.

Fig 2. Prediction results for the HCP dataset (age) for NICARA and the 1015 ROI dataset from braingraph.org. The x-axis shows the actual age in years and the y-axis the predicted age in years during the cross-validation. https://doi.org/10.1371/journal.pone.0301599.g002

Fig 3. Prediction results for the ADNI dataset (age) for the total dataset (bottom) and the gender-specific subsets (top). The x-axis shows the actual age in years and the y-axis the predicted age in years during the cross-validation. https://doi.org/10.1371/journal.pone.0301599.g003

Fig 4. Prediction results for the HCP dataset (total intelligence) for NICARA and the 1015 ROI dataset from braingraph.org. The x-axis shows the actual values and the y-axis the predicted values during the cross-validation. https://doi.org/10.1371/journal.pone.0301599.g004

Fig 6. Prediction results for the HCP dataset (crystallized intelligence) for NICARA and the 1015 ROI dataset from braingraph.org. The x-axis shows the actual values and the y-axis the predicted values during the cross-validation. https://doi.org/10.1371/journal.pone.0301599.g006

Fig 8. MAE of fluid intelligence prediction. Total dataset: green, Female subset: blue, Male subset: red. *: p < 0.05 without correction based on the Wilcoxon signed-rank test for paired absolute errors. https://doi.org/10.1371/journal.pone.0301599.g008

Table) with 94 control subjects (40 male, 54 female) with an age range from 56.5 to 91.5 years (mean: 74.43 years, standard deviation: 7.86 years; Table 2) to confirm age predictability from connectomes extracted by our in-house pipeline. Fig 1 shows the age distribution of male and female subjects from both studies.

Table 3. Prediction outcome using the correlation-based regression method for age.
absolute error (MAE) obtained from a 10-fold cross-validation as well as the range-normalized MAE (NMAE). The highest observed correlation and lowest MAE for each group is highlighted in bold. T: All subjects (total), M: Male subjects only, F: Female subjects only. https://doi.org/10.1371/journal.pone.0301599.t003

Table 8. P-values from the comparison between crystallized and fluid intelligence.
from the Wilcoxon signed-rank test between the absolute errors from crystallized and fluid intelligence prediction for each dataset.All p-values remained significant after the Benjamini-Hochberg multiple testing correction (FDR = 0.05).